89 research outputs found
Recommended from our members
Robust prediction of clinical outcomes using cytometry data.
Motivation: Flow cytometry and mass cytometry are widely used to diagnose diseases and to predict clinical outcomes. When associating clinical features with cytometry data, traditional analysis methods require cell gating as an intermediate step, leading to information loss and susceptibility to batch effects. Here, we wish to explore an alternative approach that predicts clinical features from cytometry data without the cell-gating step. We also wish to test if such a gating-free approach increases the accuracy and robustness of the prediction. Results: We propose a novel strategy (CytoDx) to predict clinical outcomes using cytometry data without cell gating. Applying CytoDx to real-world datasets allows us to predict multiple types of clinical features. In particular, CytoDx is able to predict the response to influenza vaccine using highly heterogeneous datasets, demonstrating that it is not only accurate but also robust to batch effects and cytometry platforms. Availability and implementation: CytoDx is available as an R package on Bioconductor (bioconductor.org/packages/CytoDx). Data and scripts for reproducing the results are available on bitbucket.org/zichenghu_ucsf/cytodx_study_code/downloads. Supplementary information: Supplementary data are available at Bioinformatics online.
Processing of Electronic Health Records using Deep Learning: A review
The availability of large amounts of clinical data is opening up new research
avenues in a number of fields. An exciting field in this respect is healthcare,
where secondary use of healthcare data is beginning to revolutionize
practice. Besides the availability of Big Data, both medical data from
healthcare institutions (such as EMR data) and data generated from health and
wellbeing devices (such as personal trackers), a significant contribution to
this trend is also being made by recent advances in machine learning,
specifically deep learning algorithms.
Accuracy of medical billing data against the electronic health record in the measurement of colorectal cancer screening rates.
Objective: Medical billing data are an attractive source of secondary analysis because of their ease of use and potential to answer population-health questions with statistical power. Although these datasets have known susceptibilities to biases, the degree to which they can distort the assessment of quality measures such as colorectal cancer screening rates is not widely appreciated, nor are the causes and possible solutions. Methods: Using a billing code database derived from our institution's electronic health records, we estimated the colorectal cancer screening rate of average-risk patients aged 50-74 years seen in primary care or gastroenterology clinic in 2016-2017. 200 records (150 unscreened, 50 screened) were sampled to quantify the accuracy against manual review. Results: Out of 4611 patients, an analysis of billing data suggested a 61% screening rate, an estimate that matches the estimate by the Centers for Disease Control. Manual review revealed a positive predictive value of 96% (86%-100%), negative predictive value of 21% (15%-29%) and a corrected screening rate of 85% (81%-90%). Most false negatives occurred due to examinations performed outside the scope of the database (both within and outside of our institution), but 21% of false negatives fell within the database's scope. False positives occurred due to incomplete examinations and inadequate bowel preparation. Reasons for screening failure include ordered but incomplete examinations (48%), lack of or incorrect documentation by primary care (29%) including incorrect screening intervals (13%) and patients declining screening (13%). Conclusions: Billing databases are prone to substantial bias that may go undetected even in the presence of confirmatory external estimates. Caution is recommended when performing population-level inference from these data. We propose several solutions to improve the use of these data for the assessment of healthcare quality.
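The relationship between the billing-based rate, the predictive values and a corrected rate can be sketched with a simple reweighting. This is only an illustration under stated assumptions: the function name `corrected_rate` is hypothetical, the inputs are the rounded percentages quoted in the abstract, and the abstract does not say that the authors used this estimator.

```python
def corrected_rate(billing_rate: float, ppv: float, npv: float) -> float:
    """Estimate the true screening rate from a billing-based rate.

    Assumed estimator (law of total probability, not confirmed as the
    authors' method):
      true rate = P(flagged screened)   * PPV
                + P(flagged unscreened) * (1 - NPV)
    """
    flagged_unscreened = 1.0 - billing_rate
    return billing_rate * ppv + flagged_unscreened * (1.0 - npv)


# Rounded values quoted in the abstract: 61% billing rate, PPV 96%, NPV 21%.
estimate = corrected_rate(0.61, 0.96, 0.21)
print(f"corrected screening rate: {estimate:.2f}")
```

With these rounded inputs the sketch gives about 0.89, which lies inside the reported 81%-90% interval but is not the published 85%; the published figure presumably rests on unrounded counts or a different weighting.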
Knowledge-Augmented Contrastive Learning for Abnormality Classification and Localization in Chest X-rays with Radiomics using a Feedback Loop
Building a highly accurate predictive model for classification and
localization of abnormalities in chest X-rays usually requires a large number
of manually annotated labels and pixel regions (bounding boxes) of
abnormalities. However, it is expensive to acquire such annotations, especially
the bounding boxes. Recently, contrastive learning has shown strong promise in
leveraging unlabeled natural images to produce highly generalizable and
discriminative features. However, extending its power to the medical image
domain is under-explored and highly non-trivial, since medical images are much
less amenable to data augmentations. In contrast, their prior knowledge, as
well as radiomic features, is often crucial. To bridge this gap, we propose an
end-to-end semi-supervised knowledge-augmented contrastive learning framework,
that simultaneously performs disease classification and localization tasks. The
key knob of our framework is a unique positive sampling approach tailored for
the medical images, by seamlessly integrating radiomic features as a knowledge
augmentation. Specifically, we first apply an image encoder to classify the
chest X-rays and to generate the image features. We next leverage Grad-CAM to
highlight the crucial (abnormal) regions for chest X-rays (even when
unannotated), from which we extract radiomic features. The radiomic features
are then passed through another dedicated encoder to act as the positive sample
for the image features generated from the same chest X-ray. In this way, our
framework constitutes a feedback loop for image and radiomic modality features
to mutually reinforce each other. Their contrasting yields knowledge-augmented
representations that are both robust and interpretable. Extensive experiments
on the NIH Chest X-ray dataset demonstrate that our approach outperforms
existing baselines in both classification and localization tasks. Comment: Accepted by WACV 202
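The positive-sampling idea described above, where the radiomic embedding of an X-ray serves as the positive for that X-ray's image embedding, can be illustrated with an InfoNCE-style loss. This is a minimal sketch and not the paper's implementation: the abstract does not name the loss, so InfoNCE is an assumed stand-in, and the function names, the temperature value and the plain-list feature format are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(image_feats, radiomic_feats, temperature=0.1):
    """Mean InfoNCE loss over a batch (assumed loss, hypothetical names).

    Row i of image_feats and row i of radiomic_feats come from the same
    chest X-ray and form the positive pair; every other radiomic row in
    the batch acts as a negative for image row i.
    """
    img = [l2_normalize(v) for v in image_feats]
    rad = [l2_normalize(v) for v in radiomic_feats]
    losses = []
    for i, anchor in enumerate(img):
        # Temperature-scaled cosine similarity to every radiomic embedding.
        logits = [sum(a * b for a, b in zip(anchor, r)) / temperature
                  for r in rad]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(z - peak) for z in logits))
        # Negative log-probability assigned to the matching radiomic row.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two X-rays with 2-D embeddings.
image = [[1.0, 0.0], [0.0, 1.0]]
matched = [[1.0, 0.0], [0.0, 1.0]]   # radiomic features agree with images
swapped = [[0.0, 1.0], [1.0, 0.0]]   # radiomic features paired wrongly
print(info_nce(image, matched) < info_nce(image, swapped))
```

The loss is small when each image embedding is closest to its own radiomic embedding and large when the pairing is wrong, which is the pressure that lets the two modalities reinforce each other in a setup of this kind.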
ROMOP: a light-weight R package for interfacing with OMOP-formatted electronic health record data.
Objectives: Electronic health record (EHR) data are increasingly used for biomedical discoveries. The nature of the data, however, requires expertise in both data science and EHR structure. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM) standardizes the language and structure of EHR data to promote interoperability of EHR data for research. While the OMOP CDM is valuable and more attuned to research purposes, it still requires extensive domain knowledge to utilize effectively, potentially limiting more widespread adoption of EHR data for research and quality improvement. Materials and methods: We have created ROMOP: an R package for direct interfacing with EHR data in the OMOP CDM format. Results: ROMOP streamlines typical EHR-related data processes. Its functions include exploration of data types, extraction and summarization of patient clinical and demographic data, and patient searches using any CDM vocabulary concept. Conclusion: ROMOP is freely available under the Massachusetts Institute of Technology (MIT) license and can be obtained from GitHub (http://github.com/BenGlicksberg/ROMOP). We detail instructions for setup and use in the Supplementary Materials. Additionally, we provide a public sandbox server containing synthesized clinical data for users to explore OMOP data and ROMOP (http://romop.ucsf.edu).
Protected Health Information filter (Philter): accurately and securely de-identifying free-text clinical notes.
There is a great and growing need to ascertain the exact state of a patient, in terms of disease progression, actual care practices, pathology, adverse events, and much more, beyond the paucity of data available in structured medical record data. Ascertaining these harder-to-reach data elements is now critical for the accurate phenotyping of complex traits, detection of adverse outcomes, efficacy of off-label drug use, and longitudinal patient surveillance. Clinical notes often contain the most detailed and relevant digital information about individual patients, the nuances of their diseases, the treatment strategies selected by physicians, and the resulting outcomes. However, notes remain largely unused for research because they contain Protected Health Information (PHI), which is synonymous with individually identifying data. Previous clinical note de-identification approaches have been rigid and still too inaccurate to see any substantial real-world use, primarily because they have been trained on medical text corpora that are too small. To build a new de-identification tool, we created the largest manually annotated clinical note corpus for PHI and developed a customizable open-source de-identification software called Philter ("Protected Health Information filter"). Here we describe the design and evaluation of Philter, and show how it offers substantial real-world improvements over prior methods.